Supervised Visual Attention for Simultaneous Multimodal Machine Translation
Authors
Abstract
Recently, there has been a surge in research on multimodal machine translation (MMT), where additional modalities such as images are used to improve the translation quality of textual systems. A particular use for such multimodal systems is the task of simultaneous machine translation, where the visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation. In this paper, we propose the first Transformer-based simultaneous MMT architecture, which has not previously been explored in the field. Additionally, we extend this model with an auxiliary supervision signal that guides its visual attention mechanism using labelled phrase-region alignments. We perform comprehensive experiments on three language directions and conduct thorough quantitative and qualitative analyses using both automatic metrics and manual inspection. Our results show that (i) supervised visual attention consistently improves the translation quality of the MMT models, and (ii) fine-tuning the MMT with the supervision loss enabled leads to better performance than training the MMT from scratch. Compared to the state of the art, our proposed model achieves improvements of up to 2.3 BLEU and 3.5 METEOR points.
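To make the auxiliary supervision concrete, the sketch below shows one plausible way to realise it: the model's visual attention distribution over image regions is pushed towards the labelled phrase-region alignments via a divergence term added to the translation loss. This is a minimal PyTorch illustration, not the paper's implementation; the tensor names (attn_weights, region_labels), the KL formulation, and the weighting scheme are assumptions.

import torch

def attention_supervision_loss(attn_weights, region_labels, eps=1e-8):
    # attn_weights:  (batch, tgt_len, n_regions) softmax attention over image regions
    # region_labels: (batch, tgt_len, n_regions) binary phrase-region alignment labels
    # Normalise the labels into a reference distribution per target step.
    ref = region_labels / (region_labels.sum(dim=-1, keepdim=True) + eps)
    # KL(reference || predicted attention), summed over regions, averaged over steps.
    kl = ref * (torch.log(ref + eps) - torch.log(attn_weights + eps))
    return kl.sum(dim=-1).mean()

def joint_loss(translation_ce, attn_weights, region_labels, lam=1.0):
    # Standard cross-entropy translation loss plus the weighted supervision term.
    return translation_ce + lam * attention_supervision_loss(attn_weights, region_labels)

In such a setup, lam would control how strongly the labelled alignments constrain the learned attention relative to the translation objective.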
Similar Resources
Multimodal Attention for Neural Machine Translation
The attention mechanism is an important part of neural machine translation (NMT), where it has been reported to produce richer source representations compared to the fixed-length encoding of sequence-to-sequence models. Recently, the effectiveness of attention has also been explored in the context of image captioning. In this work, we assess the feasibility of a multimodal attention mechanism that simult...
Neural Machine Translation with Supervised Attention
The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between a target word and source words. Unfortunately, it has been proved to be worse than conventional alignment models in alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a sup...
Attention-based Multimodal Neural Machine Translation
We present a novel neural machine translation (NMT) architecture associating visual and textual features for translation tasks with multiple modalities. Transformed global and regional visual features are concatenated with text to form attendable sequences which are dissipated over parallel long short-term memory (LSTM) threads to assist the encoder in generating a representation for attention-bas...
Supervised Attentions for Neural Machine Translation
In this paper, we improve the attention or alignment accuracy of neural machine translation by utilizing the alignments of training sentence pairs. We simply compute the distance between the machine attentions and the “true” alignments, and minimize this cost in the training procedure. Our experiments on large-scale Chinese-to-English task show that our model improves both translation and align...
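As a rough illustration of that idea (assuming the attention weights and the gold word alignments are available as same-shaped tensors; all names here are hypothetical, not the authors' code), the penalty can be as simple as a squared distance added to the training objective:

import torch

def alignment_distance(attention, gold_alignment, eps=1e-8):
    # attention:      (batch, tgt_len, src_len) soft attention weights
    # gold_alignment: (batch, tgt_len, src_len) binary "true" word alignments,
    #                 normalised per target word into a distribution
    gold = gold_alignment / (gold_alignment.sum(dim=-1, keepdim=True) + eps)
    return ((attention - gold) ** 2).sum(dim=-1).mean()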
Semi-Supervised Learning for Neural Machine Translation
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation. Since parallel corpora are usually limited in quantity, quality, and coverage, especially for low-resource languages, it is appealing to exploit monolingual corpora to improve NMT. We propose a semisupervised approach for training NMT model...
Journal
Journal title: Journal of Artificial Intelligence Research
Year: 2022
ISSN: 1076-9757, 1943-5037
DOI: https://doi.org/10.1613/jair.1.13546